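Inserting through-hole components is a difficult task. Because the hole tolerances are tight, small errors during insertion lead to failures. These failures can damage components and require manual intervention to recover. Errors can arise from imprecise object grasps and from bent pins. It is therefore important that the system accurately determines the object's position and rejects components with bent pins. By exploiting constraints inherent in the object, a template-matching method can obtain very precise pose estimates. Methods for pin inspection are also implemented and shown to be successful. The set-up is performed automatically and includes two novel contributions: deep-learning segmentation of the pins, and discovery of the inspection pose by simulation. Templates for pose estimation and pin inspection are then generated from the inspection pose and the segmented pins. To train the deep-learning method, a dataset of segmented through-hole components was created. The network shows 97.3% accuracy on the test set. The pin-segmentation network was also tested on the CAD models of the inserted components and segmented the pins successfully. The complete system was tested on three different objects, and the experiments show that it is able to insert all of them successfully, both by correcting grasp errors and by rejecting objects with bent pins.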
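Pose estimation is the task of determining the 6D position of an object in a scene. Pose estimation increases the capability and flexibility of robotic set-ups. However, the system must be configured for the use case to perform adequately. This configuration is time-consuming and limits the usability of pose estimation and, in turn, of robotic systems. Deep learning is one way to overcome this configuration process, by learning parameters directly from a dataset. However, obtaining this training data can also be very time-consuming. Using synthetic training data avoids the data-collection problem, but the training procedure must then be configured to overcome the domain gap. In addition, the pose estimation parameters need to be configured. This configuration is jokingly known as grad student descent, because parameters are adjusted manually until satisfactory results are obtained. This paper presents a method for automatic configuration using only synthetic data. This is achieved by learning the domain randomization during network training and then using the domain randomization to optimize the pose estimation parameters. The developed method shows state-of-the-art performance of 82.0% recall on the challenging Occlusion dataset, outperforming all previous methods. These results demonstrate the validity of automatically setting up pose estimation using purely synthetic data.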
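This paper presents a novel network architecture for the classification of large-scale point clouds. The network is used to classify metadata from cuneiform tablets. As more than one million tablets remain unprocessed, this can help create an overview of the tablets. The network is tested on a comparison dataset and obtains state-of-the-art performance. We also introduce new metadata classification tasks, on which the network shows promising results. Finally, we introduce a novel maximum-attention visualization, which shows that the trained network focuses on the intended features. Code is available at https://github.com/fhagelskjaer/dlc-cuneiform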
Segmentation of lidar data provides rich, point-wise information about the environment of robots or autonomous vehicles. The currently best-performing neural networks for lidar segmentation are fine-tuned to specific datasets. Switching the lidar sensor without retraining on a large set of annotated data from the new sensor creates a domain shift, which causes network performance to drop drastically. In this work we propose a new method for lidar domain adaptation, in which we use annotated panoptic lidar datasets and recreate the recorded scenes in the structure of a different lidar sensor. We narrow the domain gap to the target data by recreating panoptic data from one domain in another and mixing the generated data with parts of (pseudo-)labeled target domain data. Our method improves nuScenes to SemanticKITTI unsupervised domain adaptation performance by 15.2 mean Intersection over Union (mIoU) points and by 48.3 mIoU points in our semi-supervised approach. We demonstrate similar improvements for SemanticKITTI to nuScenes domain adaptation, of 21.8 and 51.5 mIoU points, respectively. We compare our method with two state-of-the-art approaches for semantic lidar segmentation domain adaptation and obtain a significant improvement for both unsupervised and semi-supervised domain adaptation. Furthermore, we successfully apply our proposed method to two entirely unlabeled datasets from two state-of-the-art lidar sensors, Velodyne Alpha Prime and InnovizTwo, and train well-performing semantic segmentation networks for both.
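The gains above are reported in mean Intersection over Union (mIoU). As a reminder of what that metric measures, here is a minimal sketch of the standard definition over per-point labels. The function name, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration; the abstract does not describe the exact evaluation protocol (for example, which classes are ignored).

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred and target are equal-length sequences of integer class labels,
    one per lidar point. Classes that occur in neither sequence are
    skipped (an assumption; benchmarks define their own ignore rules).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in prediction or target")
    return sum(ious) / len(ious)


# Toy example with three classes and six points (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Multiplying the score by 100 gives the percentage-point scale in which improvements such as "15.2 mIoU points" are usually quoted.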
Explainable AI (XAI) is slowly becoming a key component for many AI applications. Rule-based and modified backpropagation XAI approaches, however, often face challenges when applied to modern model architectures that include innovative layer building blocks, for two reasons. Firstly, the high flexibility of rule-based XAI methods leads to numerous potential parameterizations. Secondly, many XAI methods break the implementation-invariance axiom because they struggle with certain model components, e.g., BatchNorm layers. The latter can be addressed with model canonization, which is the process of re-structuring the model to disregard problematic components without changing the underlying function. While model canonization is straightforward for simple architectures (e.g., VGG, ResNet), it can be challenging for more complex and highly interconnected models (e.g., DenseNet). Moreover, there is little quantifiable evidence that model canonization is beneficial for XAI. In this work, we propose canonizations for currently relevant model blocks applicable to popular deep neural network architectures, including VGG, ResNet, EfficientNet, and DenseNets, as well as Relation Networks. We further suggest an XAI evaluation framework with which we quantify and compare the effects of model canonization for various XAI methods in image classification tasks on the Pascal-VOC and ILSVRC2017 datasets, as well as for Visual Question Answering using CLEVR-XAI. Moreover, addressing the former issue outlined above, we demonstrate how our evaluation framework can be applied to perform hyperparameter search for XAI methods to optimize the quality of explanations.
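To make "re-structuring the model without changing the underlying function" concrete, the sketch below shows the simplest well-known case: folding an inference-mode BatchNorm layer into the linear layer that precedes it. This is not the paper's canonization procedure for DenseNet or Relation Networks, which the abstract does not detail; all function names and numeric values here are hypothetical.

```python
import math


def linear(W, b, x):
    """Affine layer y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def batchnorm_eval(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm using fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


def fold_batchnorm(W, b, gamma, beta, mean, var, eps=1e-5):
    """Return (W_f, b_f) such that linear(W_f, b_f, x) equals
    batchnorm_eval(linear(W, b, x), ...) for every input x."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    W_f = [[s * w for w in row] for s, row in zip(scale, W)]
    b_f = [s * (bi - m) + bt for s, bi, m, bt in zip(scale, b, mean, beta)]
    return W_f, b_f


# Hypothetical parameters for a 2-in, 2-out layer.
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.05, -0.4], [0.9, 1.7]
x = [1.0, -2.0]

reference = batchnorm_eval(linear(W, b, x), gamma, beta, mean, var)
W_f, b_f = fold_batchnorm(W, b, gamma, beta, mean, var)
folded = linear(W_f, b_f, x)  # matches reference up to rounding
```

After folding, the BatchNorm layer no longer appears as a separate component, so an attribution rule only has to handle a plain linear layer while the network output stays the same.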
Autonomous vehicles currently suffer from a time-inefficient driving style caused by uncertainty about human behavior in traffic interactions. Accurate and reliable prediction models enabling more efficient trajectory planning could make autonomous vehicles more assertive in such interactions. However, the evaluation of such models is commonly oversimplistic, ignoring the asymmetric importance of prediction errors and the heterogeneity of the datasets used for testing. We examine the potential of recasting interactions between vehicles as gap acceptance scenarios and evaluating models in this structured environment. To that end, we develop a framework facilitating the evaluation of any model, by any metric, and in any scenario. We then apply this framework to state-of-the-art prediction models, which all show themselves to be unreliable in the most safety-critical situations.
Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications. However, collaboration between these actors is difficult due to the heterogeneous nature of geospatial data modalities (e.g., multi-spectral images of various resolutions, timeseries, weather data) and diversity of tasks (e.g., regression of human activity indicators or detecting forest fires). In this work, we present a roadmap towards the construction of a general-purpose neural architecture (GPNA) with a geospatial inductive bias, pre-trained on large amounts of unlabelled earth observation data in a self-supervised manner. We envision how such a model may facilitate cooperation between members of the community. We show preliminary results on the first step of the roadmap, where we instantiate an architecture that can process a wide variety of geospatial data modalities and demonstrate that it can achieve competitive performance with domain-specific architectures on tasks relating to the U.N.'s Sustainable Development Goals.
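The physics of a closed quantum mechanical system is governed by its Hamiltonian. However, in most practical situations this Hamiltonian is not known, and ultimately all that is available is data obtained from measurements on the system. In this work, we learn families of interacting many-body Hamiltonians from dynamical data by bringing together gradient-based optimization techniques from machine learning with tensor networks. Our approach is highly practical, experimentally friendly, and intrinsically scalable, allowing for system sizes of more than 100 spins. In particular, we demonstrate on synthetic data that the algorithm works even when restricted to a single simple initial state, a small number of single-qubit observables, and time evolution up to relatively short times. For the concrete example of the one-dimensional Heisenberg model, our algorithm exhibits an error that is constant in the system size and scales as the inverse square root of the dataset size.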
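Neural networks trained with stochastic gradient descent (SGD) from different random initializations typically end up functionally very similar, which raises the question of whether there are meaningful differences between different SGD solutions. Entezari et al. recently conjectured that, despite different initializations, the solutions found by SGD lie in the same loss valley once the permutation invariance of neural networks is taken into account. Concretely, they hypothesize that any two solutions found by SGD can be permuted such that the linear interpolation between their parameters forms a path without significant increases in loss. Here, we use a simple but powerful algorithm to find such permutations, which allows us to obtain direct empirical evidence that the hypothesis holds in fully connected networks. Strikingly, we find that two networks already lie in the same loss valley at initialization, and that averaging their random but suitably permuted initializations performs significantly above chance. In contrast, for convolutional architectures, our evidence suggests that the hypothesis does not hold. Especially in the large-learning-rate regime, SGD seems to discover diverse modes.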
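Two ingredients of the hypothesis can be sketched directly: permuting the hidden units of a network leaves its function unchanged, and two parameter sets can be joined by a linear interpolation path. The code below illustrates both for a one-hidden-layer ReLU network. It does not implement the permutation-finding algorithm, which the abstract does not specify; all names and weights are hypothetical.

```python
def mlp(params, x):
    """One-hidden-layer ReLU network: W2 @ relu(W1 @ x + b1) + b2."""
    W1, b1, W2, b2 = params
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]


def permute_hidden(params, perm):
    """Reorder hidden units: rows of W1 and b1, and columns of W2."""
    W1, b1, W2, b2 = params
    return ([W1[i] for i in perm], [b1[i] for i in perm],
            [[row[i] for i in perm] for row in W2], b2)


def interpolate(pa, pb, alpha):
    """Point (1 - alpha) * pa + alpha * pb on the linear path in weight space."""
    def mix(a, b):
        if isinstance(a, list):
            return [mix(u, v) for u, v in zip(a, b)]
        return (1 - alpha) * a + alpha * b
    return tuple(mix(a, b) for a, b in zip(pa, pb))


# Hypothetical network with 2 inputs, 3 hidden units, 1 output.
params_a = ([[0.5, -1.0], [1.5, 0.3], [-0.7, 0.9]],
            [0.1, -0.2, 0.05],
            [[1.0, -0.5, 2.0]],
            [0.3])
params_perm = permute_hidden(params_a, [2, 0, 1])

x = [0.8, -0.4]
out_original = mlp(params_a, x)
out_permuted = mlp(params_perm, x)   # same function, different parameters
midpoint = interpolate(params_a, params_perm, 0.5)
```

The midpoint of the unaligned pair generally computes a different function from either endpoint, which is why the hypothesis is stated only after a suitable permutation has aligned the two networks.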
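Reconstructing force fields (FFs) from atomistic simulation data is a challenge, since accurate data can be very expensive. Here, machine learning (ML) models can help to be data-economic, as they can be successfully constrained using the underlying symmetries and physical conservation laws. However, so far, every newly proposed descriptor for an ML model has required a cumbersome and mathematically tedious remodeling. We therefore propose using modern techniques from algorithmic differentiation within the ML modeling process, effectively enabling the use of novel descriptors or models fully automatically and at an order of magnitude higher computational efficiency. This paradigmatic approach enables not only the versatile use of novel representations and efficient computation (both of high value to the FF community), but also the simple inclusion of further physical knowledge, such as higher-order information (e.g., Hessians, more complex partial differential equation constraints, etc.), even beyond the presented FF domain.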
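The core idea, obtaining forces automatically as the negative derivative of an energy function instead of deriving them by hand, can be sketched with a tiny forward-mode differentiation scheme based on dual numbers. This is an illustration only: the abstract does not say which differentiation mode or framework the authors use, and the Lennard-Jones pair potential is a stand-in chosen here, not a model from the paper.

```python
class Dual:
    """Number val + der * e with e**2 == 0, used for forward-mode AD."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self

    def __pow__(self, n):
        # n must be a plain constant exponent, not a Dual.
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)


def lj_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair energy (stand-in potential, not from the paper)."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)


def force(energy_fn, r):
    """Force F = -dE/dr, obtained by seeding the derivative of r with 1."""
    return -energy_fn(Dual(r, 1.0)).der


f_repulsive = force(lj_energy, 0.9)    # positive: pushes the pair apart
f_attractive = force(lj_energy, 1.5)   # negative: pulls the pair together
```

Swapping in a different energy function requires no new derivative code, which is the convenience the abstract attributes to algorithmic differentiation; production frameworks typically use reverse-mode differentiation to handle many coordinates efficiently.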